All Questions
Tagged with: tensorflow, natural-language-processing
16 questions
1 vote
0 answers
23 views
Convolutional network for multilabel classification in NLP
I am trying to label code snippets, basing my approach on this article: https://arxiv.org/pdf/1906.01032.pdf. My dataset is just code snippets (tokenized as ASCII characters) and 500 different labels from ...
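For multilabel output (as opposed to multiclass), the usual pattern is one sigmoid unit per label trained with binary cross-entropy, thresholding each probability independently so any number of labels can fire per snippet. A minimal stdlib sketch of the decision step (the 0.5 threshold and the label names are illustrative assumptions, not from the question):

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function mapping a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(logits, labels, threshold=0.5):
    """Independently threshold each label's sigmoid probability.

    Unlike softmax, the probabilities need not sum to 1, so zero,
    one, or many labels can be assigned to the same snippet.
    """
    return [name for score, name in zip(logits, labels)
            if sigmoid(score) >= threshold]

labels = ["loop", "recursion", "io"]             # hypothetical label names
print(predict_labels([2.0, -1.0, 0.3], labels))  # ['loop', 'io']
```

In Keras terms this corresponds to a final `Dense(num_labels, activation="sigmoid")` layer with `binary_crossentropy` loss, rather than softmax with categorical cross-entropy.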
0 votes
2 answers
5k views
How to fine-tune GPT-J with small dataset
I have followed this guide as closely as possible (https://github.com/kingoflolz/mesh-transformer-jax). I'm trying to fine-tune GPT-J with a small dataset of ~500 lines: ...
0 votes
2 answers
4k views
How to train an LSTM with varying length input?
I have a dataset where each training instance differs in length and the data is sequential. I have designed an LSTM, but I am unsure how to train it. With fixed-length data, ...
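The standard answer is to pad every sequence in a batch to a common length and mask the padding so the LSTM ignores it (in Keras this is `pad_sequences` plus `mask_zero=True` on the `Embedding` layer). A stdlib sketch of the padding step, assuming token id 0 is reserved as the pad value:

```python
def pad_sequences(seqs, pad_value=0):
    """Right-pad every sequence to the length of the longest one.

    pad_value (0 here) must be reserved, i.e. never used as a real
    token id, so the mask can be recovered later as (token != 0).
    """
    max_len = max(len(s) for s in seqs)
    return [s + [pad_value] * (max_len - len(s)) for s in seqs]

batch = [[5, 2, 9], [7], [3, 1]]
padded = pad_sequences(batch)
print(padded)  # [[5, 2, 9], [7, 0, 0], [3, 1, 0]]

# Boolean mask telling the RNN which positions are real tokens:
mask = [[t != 0 for t in row] for row in padded]
```

Bucketing sequences of similar length into the same batch reduces wasted computation on padding.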
3 votes
1 answer
793 views
How large should the corpus be to optimally retrain the GPT-2 model?
I just started working with the GPT-2 models and want to retrain one on a pretty narrow topic, but I have trouble finding training material. How large should the corpus be to optimally retrain the GPT-...
1 vote
0 answers
131 views
Embedding Layer into Convolution Layer
I'm looking to encode PDF documents for deep learning such that an image representation of the PDF refers to word embeddings instead of graphic data. So I've indexed a relatively small vocabulary (88 ...
1 vote
0 answers
43 views
Low accuracy during training for text summarization
I am trying to implement an extractive text summarization model. I am using Keras and TensorFlow. I have used BERT sentence embeddings, and the embeddings are fed into an LSTM layer and then to a Dense ...
2 votes
2 answers
4k views
Why is my loss (binary cross entropy) converging on ~0.6? (Task: Natural Language Inference)
I’m trying to debug my neural network (BERT fine-tuning) trained for natural language inference with binary classification of either entailment or contradiction. I've trained it for 80 epochs and its ...
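A useful reference point when debugging this: a binary classifier that always outputs p = 0.5 has a cross-entropy of ln 2 ≈ 0.693, so a loss stuck near 0.6 means the network is only slightly better than chance. A quick stdlib check of that baseline (the p = 0.6 figure below is an arbitrary illustration):

```python
import math

def bce(y_true: float, p: float) -> float:
    """Binary cross-entropy for a single example."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# A model that always predicts 0.5 on a balanced dataset:
chance_loss = 0.5 * bce(1, 0.5) + 0.5 * bce(0, 0.5)
print(round(chance_loss, 4))  # 0.6931 == ln(2)

# A slightly informative model assigning p = 0.6 to the true class:
print(round(0.5 * bce(1, 0.6) + 0.5 * bce(0, 0.4), 4))  # 0.5108
```

Loss plateauing just under ln 2 after many epochs usually points at the inputs or labels (shuffled pairs, truncation eating the hypothesis, wrong segment ids) rather than at the optimizer.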
1 vote
0 answers
232 views
Simple sequential model with LSTM which doesn't converge
I'm trying to create a sequential neural network in order to translate a "human" sentence into a "machine" sentence understandable by an algorithm. Since it didn't work, I've tried to create a NN ...
3 votes
0 answers
497 views
How to use TPU for real-time low-latency inference?
I use Google's Cloud TPU hardware extensively with TensorFlow for training models and inference; however, when I run inference I do it in large batches. The TPU takes about 3 minutes to warm up ...
1 vote
2 answers
449 views
Why is embedding important in NLP, and how does autoencoder work?
People say embedding is necessary in NLP because using raw word indices is inefficient, since similar words are supposed to be related to each other. However, I still don't truly get ...
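The core of the answer: a word index is an arbitrary integer (index 5 is no "closer" to index 6 than to index 5000), whereas an embedding maps each index to a dense vector whose distances can encode similarity. A minimal lookup-table sketch, with toy 2-d vectors made up purely for illustration:

```python
import math

# Toy embedding table: word id -> dense vector (values are illustrative).
embedding = {
    0: [0.9, 0.1],   # "cat"
    1: [0.85, 0.2],  # "dog"  -- deliberately close to "cat"
    2: [0.1, 0.95],  # "car"  -- far from both
}

def cosine(u, v):
    """Cosine similarity: 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# "cat" is more similar to "dog" than to "car" under this geometry:
print(cosine(embedding[0], embedding[1]) > cosine(embedding[0], embedding[2]))  # True
```

In a real model the table is a trainable weight matrix (e.g. Keras's `Embedding` layer), and training nudges related words toward nearby vectors; an autoencoder is one way to learn such a compressed representation, though word2vec-style objectives are more common for words.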
0 votes
1 answer
58 views
How to change this RNN text classification code to become text generation code?
I can do text classification with an RNN, in which the last output of the RNN (rnn_outputs[-1]) is multiplied by the output-layer weights plus a bias. That is, getting a word (class name) after the last T ...
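The structural change is: for classification only the last RNN output feeds the output layer, while for generation every timestep's output is projected to vocabulary logits, and at inference each predicted token is fed back as the next input. A stdlib sketch of that greedy feedback loop, with a toy next-token table standing in for the trained RNN (the table and tokens are invented for illustration):

```python
# Toy stand-in for "RNN step + output projection": maps the current
# token to its most likely successor (a real model returns logits
# over the whole vocabulary at every timestep).
next_token = {"<s>": "the", "the": "cat", "cat": "sat", "sat": "</s>"}

def generate(start="<s>", max_len=10):
    """Greedy autoregressive decoding: feed each prediction back in."""
    tokens, cur = [], start
    for _ in range(max_len):
        cur = next_token[cur]      # model's prediction for this step
        if cur == "</s>":          # stop at the end-of-sequence token
            break
        tokens.append(cur)
    return tokens

print(generate())  # ['the', 'cat', 'sat']
```

Training changes accordingly: the loss is computed against the next token at every position (the input shifted by one), not against a single class label.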
1 vote
1 answer
72 views
Do I need to use a pre-processed dataset to classify comments?
I want to use machine learning for text classification; more precisely, I want to determine whether a text (or comment) is positive or negative. I can download a dataset with 120 million comments. I ...
5 votes
1 answer
175 views
Which model should I use to determine the similarity between predefined sentences and new sentences?
The Levenshtein algorithm with some ratio-based scoring may handle this use case. Given pre-defined statements such as "I have a dog", "I own a car", and many more, I must ...
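For a quick character-level baseline, Python's standard library already provides a similarity ratio via difflib's `SequenceMatcher` (related in spirit to edit distance); sentence-embedding models are the usual upgrade once paraphrases with different wording must also match. A stdlib sketch of ranking the predefined statements against a new sentence:

```python
from difflib import SequenceMatcher

def best_match(query, statements):
    """Return the predefined statement most similar to the query.

    ratio() yields a score in [0, 1]; 1.0 is an exact match. This is
    surface-level matching only: paraphrases with different wording
    score poorly, which is where sentence embeddings help.
    """
    return max(statements,
               key=lambda s: SequenceMatcher(None, query, s).ratio())

statements = ["I have a dog", "I own a car"]
print(best_match("I have a puppy", statements))  # I have a dog
```

In practice you would also keep the score and apply a minimum threshold, so clearly unrelated sentences map to no statement at all.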
2 votes
0 answers
44 views
Sequence to sequence machine learning / NMT - converting numbers into words
I want to do some sequence-to-sequence modelling on source data that looks like this: /-0.013428/-0.124969/-0.13435/0.008087/-0.269241/-0.36849/ with target data ...